Classification in an informative sample subspace
Authors
Abstract
We have developed an Informative Sample Subspace (ISS) method that is suitable for projecting high-dimensional data onto a low-dimensional subspace for classification purposes. In this paper, we present an ISS algorithm that uses a maximal mutual information criterion to search a labelled training dataset directly for the subspace's projection base vectors. We evaluate the usefulness of the ISS method using synthetic data as well as real-world problems. Experimental results demonstrate that the ISS algorithm is effective and can be used as a general method for representing high-dimensional data in a low-dimensional subspace for classification.
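The abstract describes ISS only at a high level, so the following is a minimal, hypothetical sketch of the general idea: candidate projection directions are drawn from the training samples themselves and scored by the mutual information between the resulting one-dimensional projection and the class labels. The function name `select_iss_basis`, the greedy deflation step, and the use of scikit-learn's `mutual_info_classif` as the MI estimator are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of an informative-sample-subspace style search (not the
# authors' algorithm): greedily pick projection directions derived from the
# training samples, scoring each candidate by the mutual information between
# its 1-D projection and the class labels.
import numpy as np
from sklearn.feature_selection import mutual_info_classif


def select_iss_basis(X, y, n_dims=2):
    """Return an (n_dims, n_features) array of near-orthogonal base vectors."""
    basis = []
    residual = X.astype(float).copy()
    for _ in range(n_dims):
        best_score, best_vec = -np.inf, None
        for i in range(residual.shape[0]):
            v = residual[i]
            norm = np.linalg.norm(v)
            if norm < 1e-12:                       # sample already lies in the chosen span
                continue
            v = v / norm
            proj = (residual @ v).reshape(-1, 1)   # 1-D projection of all samples
            score = mutual_info_classif(proj, y)[0]
            if score > best_score:
                best_score, best_vec = score, v
        basis.append(best_vec)
        # Deflate so the next direction adds information not already captured.
        residual = residual - np.outer(residual @ best_vec, best_vec)
    return np.array(basis)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(150, 40))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # toy labels
    B = select_iss_basis(X, y, n_dims=2)
    Z = X @ B.T                                    # low-dimensional representation
    print(Z.shape)                                 # (150, 2)
```

In practice the number of dimensions and the choice of MI estimator would be tuned on validation data; the sketch only illustrates the search-the-training-set idea stated in the abstract.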
Similar Resources
Image Classification via Sparse Representation and Subspace Alignment
Image representation is a crucial problem in image processing, where many low-level representations exist, e.g., SIFT, HOG and so on. But there is a missing link between low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation and principal component analysis, are employed to d...
Using Distribution of Data to Enhance Performance of Fuzzy Classification Systems
This paper considers the automatic design of fuzzy rule-based classification systems based on labeled data. The classification performance and interpretability are of major importance in these systems. In this paper, we utilize the distribution of training patterns in the decision subspace of each fuzzy rule to improve its initially assigned certainty grade (i.e., rule weight). Our approach uses a punish...
Random Subspace Method with Feature Subsets Selected by a Fuzzy Class Separability Index
Classifier combining techniques have become popular for improving weak classifiers in recent years. The random subspace method (RSM) is an efficient classifier combining technique that can improve the classification performance of weak classifiers for small sample size (SSS) problems. In RSM, the feature subsets are randomly selected and the resulting datasets are used to train classifiers. How...
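As a concrete illustration of the random subspace method summarized above, the following sketch trains several shallow decision trees on randomly chosen feature subsets and combines them by majority vote; the class name, subset size, and base learner are illustrative choices, not those of the cited paper.

```python
# Illustrative random-subspace ensemble: shallow decision trees trained on
# random feature subsets, combined by majority vote. Parameter values are
# assumptions for this sketch.
import numpy as np
from sklearn.tree import DecisionTreeClassifier


class RandomSubspaceEnsemble:
    def __init__(self, n_estimators=15, subspace_size=5, random_state=0):
        self.n_estimators = n_estimators
        self.subspace_size = subspace_size
        self.rng = np.random.default_rng(random_state)
        self.members = []                 # (feature_indices, fitted_tree) pairs

    def fit(self, X, y):
        n_features = X.shape[1]
        for _ in range(self.n_estimators):
            idx = self.rng.choice(n_features, self.subspace_size, replace=False)
            tree = DecisionTreeClassifier(max_depth=3).fit(X[:, idx], y)
            self.members.append((idx, tree))
        return self

    def predict(self, X):
        votes = np.stack([tree.predict(X[:, idx]) for idx, tree in self.members])
        # Majority vote over ensemble members, one column per test sample.
        return np.apply_along_axis(
            lambda col: np.bincount(col.astype(int)).argmax(), 0, votes)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(120, 30))
    y = (X[:, 0] - X[:, 3] > 0).astype(int)
    model = RandomSubspaceEnsemble().fit(X, y)
    print((model.predict(X) == y).mean())   # training accuracy of the ensemble
```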
Stratified sampling for feature subspace selection in random forests for high dimensional data
For high-dimensional data, a large portion of the features are often not informative of the class of the objects. Random forest algorithms tend to use simple random sampling of features in building their decision trees and consequently select many subspaces that contain few, if any, informative features. In this paper we propose a stratified sampling method to select the feature subspaces for rand...
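The snippet above only names the stratified idea, so here is a hedged sketch of one plausible reading: rank the features by an informativeness score, split them into a strong and a weak stratum, and draw each tree's subspace from both strata so that every subspace contains at least some informative features. The scoring function, the 50/50 stratum split, and the mixing fraction below are assumptions for illustration, not the cited paper's procedure.

```python
# Hedged sketch of stratified feature-subspace sampling: rank features by an
# informativeness score, split into strong/weak strata, and draw each subspace
# from both so it always contains some informative features.
import numpy as np
from sklearn.feature_selection import mutual_info_classif


def stratified_subspace(X, y, subspace_size=8, informative_fraction=0.5, rng=None):
    """Return feature indices for one tree's subspace."""
    if rng is None:
        rng = np.random.default_rng()
    scores = mutual_info_classif(X, y)                 # per-feature informativeness
    order = np.argsort(scores)[::-1]
    half = len(order) // 2
    strong, weak = order[:half], order[half:]          # two strata
    k_strong = min(int(round(subspace_size * informative_fraction)), len(strong))
    picked = np.concatenate([
        rng.choice(strong, size=k_strong, replace=False),
        rng.choice(weak, size=subspace_size - k_strong, replace=False),
    ])
    return picked


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    X = rng.normal(size=(120, 40))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    print(sorted(stratified_subspace(X, y, rng=rng)))  # mix of strong and weak features
```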
Classifying Very High-Dimensional Data with Random Forests Built from Small Subspaces
The selection of feature subspaces for growing decision trees is a key step in building random forest models. However, the common approach of randomly sampling a few features for the subspace is not suitable for high-dimensional data consisting of thousands of features, because such data often contain many features which are uninformative to classification, and the random sampling often does...
Journal: Pattern Recognition
Volume 41, Issue -
Pages -
Publication year: 2008